Goto

Collaborating Authors

 Peterborough County


Estranged Predictions: Measuring Semantic Category Disruption with Masked Language Modelling

Liu, Yuxuan, Dubossarsky, Haim, Ahnert, Ruth

arXiv.org Artificial Intelligence

This paper examines how science fiction destabilises ontological categories by measuring conceptual permeability across the terms human, animal, and machine using masked language modelling (MLM). Drawing on corpora of science fiction (Gollancz SF Masterworks) and general fiction (NovelTM), we operationalise Darko Suvin's theory of estrangement as computationally measurable deviation in token prediction, using RoBERTa to generate lexical substitutes for masked referents and classifying them via Gemini. We quantify conceptual slippage through three metrics: retention rate, replacement rate, and entropy, mapping the stability or disruption of category boundaries across genres. Our findings reveal that science fiction exhibits heightened conceptual permeability, particularly around machine referents, which show significant cross-category substitution and dispersion. Human terms, by contrast, maintain semantic coherence and often anchor substitutional hierarchies. These patterns suggest a genre-specific restructuring within anthropocentric logics. We argue that estrangement in science fiction operates as a controlled perturbation of semantic norms, detectable through probabilistic modelling, and that MLMs, when used critically, serve as interpretive instruments capable of surfacing genre-conditioned ontological assumptions. This study contributes to the methodological repertoire of computational literary studies and offers new insights into the linguistic infrastructure of science fiction.


Task-Aware Reduction for Scalable LLM-Database Systems

Barnes, Marcus Emmanuel, Ghaleb, Taher A., Hassan, Safwat

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly applied to data-intensive workflows, from database querying to developer observability. Yet the effectiveness of these systems is constrained by the volume, verbosity, and noise of real-world text-rich data such as logs, telemetry, and monitoring streams. Feeding such data directly into LLMs is costly, environmentally unsustainable, and often misaligned with task objectives. Parallel efforts in LLM efficiency have focused on model- or architecture-level optimizations, but the challenge of reducing upstream input verbosity remains underexplored. In this paper, we argue for treating the token budget of an LLM as an attention budget and elevating task-aware text reduction as a first-class design principle for language -- data systems. We position input-side reduction not as compression, but as attention allocation: prioritizing information most relevant to downstream tasks. We outline open research challenges for building benchmarks, designing adaptive reduction pipelines, and integrating token-budget--aware preprocessing into database and retrieval systems. Our vision is to channel scarce attention resources toward meaningful signals in noisy, data-intensive workflows, enabling scalable, accurate, and sustainable LLM--data integration.


WeatherArchive-Bench: Benchmarking Retrieval-Augmented Reasoning for Historical Weather Archives

Yu, Yongan, Du, Xianda, Hu, Qingchen, Liang, Jiahao, Ni, Jingwei, Qiang, Dan, Huang, Kaiyu, McKenzie, Grant, Sieber, Renee, Mo, Fengran

arXiv.org Artificial Intelligence

Historical archives on weather events are collections of enduring primary source records that offer rich, untapped narratives of how societies have experienced and responded to extreme weather events. These qualitative accounts provide insights into societal vulnerability and resilience that are largely absent from meteorological records, making them valuable for climate scientists to understand societal responses. However, their vast scale, noisy digitized quality, and archaic language make it difficult to transform them into structured knowledge for climate research. To address this challenge, we introduce WeatherArchive-Bench, the first benchmark for evaluating retrieval-augmented generation (RAG) systems on historical weather archives. WeatherArchive-Bench comprises two tasks: WeatherArchive-Retrieval, which measures a system's ability to locate historically relevant passages from over one million archival news segments, and WeatherArchive-Assessment, which evaluates whether Large Language Models (LLMs) can classify societal vulnerability and resilience indicators from extreme weather narratives. Extensive experiments across sparse, dense, and re-ranking retrievers, as well as a diverse set of LLMs, reveal that dense retrievers often fail on historical terminology, while LLMs frequently misinterpret vulnerability and resilience concepts. These findings highlight key limitations in reasoning about complex societal indicators and provide insights for designing more robust climate-focused RAG systems from archival contexts. The constructed dataset and evaluation framework are publicly available at https://anonymous.4open.science/r/WeatherArchive-Bench/.


Bridging Minds and Machines: Toward an Integration of AI and Cognitive Science

Mao, Rui, Liu, Qian, Li, Xiao, Cambria, Erik, Hussain, Amir

arXiv.org Artificial Intelligence

Cognitive Science has profoundly shaped disciplines such as Artificial Intelligence (AI), Philosophy, Psychology, Neuroscience, Linguistics, and Culture. Many breakthroughs in AI trace their roots to cognitive theories, while AI itself has become an indispensable tool for advancing cognitive research. This reciprocal relationship motivates a comprehensive review of the intersections between AI and Cognitive Science. By synthesizing key contributions from both perspectives, we observe that AI progress has largely emphasized practical task performance, whereas its cognitive foundations remain conceptually fragmented. We argue that the future of AI within Cognitive Science lies not only in improving performance but also in constructing systems that deepen our understanding of the human mind. Promising directions include aligning AI behaviors with cognitive frameworks, situating AI in embodiment and culture, developing personalized cognitive models, and rethinking AI ethics through cognitive co-evaluation.


Website visits can predict angler presence using machine learning

Schmid, Julia S., Simmons, Sean, Lewis, Mark A., Poesch, Mark S., Ramazi, Pouria

arXiv.org Artificial Intelligence

Understanding and predicting recreational fishing activity is important for sustainable fisheries management. However, traditional methods of measuring fishing pressure, such as surveys, can be costly and limited in both time and spatial extent. Predictive models that relate fishing activity to environmental or economic factors typically rely on historical data, which often restricts their spatial applicability due to data scarcity. In this study, high-resolution angler-generated data from an online platform and easily accessible auxiliary data were tested to predict daily boat presence and aerial counts of boats at almost 200 lakes over five years in Ontario, Canada. Lake-information website visits alone enabled predicting daily angler boat presence with 78% accuracy. While incorporating additional environmental, socio-ecological, weather and angler-generated features into machine learning models did not remarkably improve prediction performance of boat presence, they were substantial for the prediction of boat counts. Models achieved an R2 of up to 0.77 at known lakes included in the model training, but they performed poorly for unknown lakes (R2 = 0.21). The results demonstrate the value of integrating angler-generated data from online platforms into predictive models and highlight the potential of machine learning models to enhance fisheries management.


Frustratingly Simple Prompting-based Text Denoising

Park, Jungyeul, Qiu, Mengyang

arXiv.org Artificial Intelligence

This paper introduces a novel perspective on the automated essay scoring (AES) task, challenging the conventional view of the ASAP dataset as a static entity. Employing simple text denoising techniques using prompting, we explore the dynamic potential within the dataset. While acknowledging the previous emphasis on building regression systems, our paper underscores how making minor changes to a dataset through text denoising can enhance the final results. Text denoising is a crucial step in natural language processing (NLP) and text analysis tasks (Sun & Jiang, 2019; Xian et al., 2021). One of its major applications is in optical character recognition (OCR).


Enhancing Financial Data Visualization for Investment Decision-Making

Patel, Nisarg, Shah, Harmit, Mewada, Kishan

arXiv.org Artificial Intelligence

Navigating the intricate landscape of financial markets requires adept forecasting of stock price movements. This paper delves into the potential of Long Short-Term Memory (LSTM) networks for predicting stock dynamics, with a focus on discerning nuanced rise and fall patterns. Leveraging a dataset from the New York Stock Exchange (NYSE), the study incorporates multiple features to enhance LSTM's capacity in capturing complex patterns. Visualization of key attributes, such as opening, closing, low, and high prices, aids in unraveling subtle distinctions crucial for comprehensive market understanding. The meticulously crafted LSTM input structure, inspired by established guidelines, incorporates both price and volume attributes over a 25-day time step, enabling the model to capture temporal intricacies. A comprehensive methodology, including hyperparameter tuning with Grid Search, Early Stopping, and Callback mechanisms, leads to a remarkable 53% improvement in predictive accuracy. The study concludes with insights into model robustness, contributions to financial forecasting literature, and a roadmap for real-time stock market prediction. The amalgamation of LSTM networks, strategic hyperparameter tuning, and informed feature selection presents a potent framework for advancing the accuracy of stock price predictions, contributing substantively to financial time series forecasting discourse.


Multi-User MultiWOZ: Task-Oriented Dialogues among Multiple Users

Jo, Yohan, Zhao, Xinyan, Biswas, Arijit, Basiou, Nikoletta, Auvray, Vincent, Malandrakis, Nikolaos, Metallinou, Angeliki, Potamianos, Alexandros

arXiv.org Artificial Intelligence

While most task-oriented dialogues assume conversations between the agent and one user at a time, dialogue systems are increasingly expected to communicate with multiple users simultaneously who make decisions collaboratively. To facilitate development of such systems, we release the Multi-User MultiWOZ dataset: task-oriented dialogues among two users and one agent. To collect this dataset, each user utterance from MultiWOZ 2.2 was replaced with a small chat between two users that is semantically and pragmatically consistent with the original user utterance, thus resulting in the same dialogue state and system response. These dialogues reflect interesting dynamics of collaborative decision-making in task-oriented scenarios, e.g., social chatter and deliberation. Supported by this data, we propose the novel task of multi-user contextual query rewriting: to rewrite a task-oriented chat between two users as a concise task-oriented query that retains only task-relevant information and that is directly consumable by the dialogue system. We demonstrate that in multi-user dialogues, using predicted rewrites substantially improves dialogue state tracking without modifying existing dialogue systems that are trained for single-user dialogues. Further, this method surpasses training a medium-sized model directly on multi-user dialogues and generalizes to unseen domains.


Unsupervised Dialogue Topic Segmentation with Topic-aware Utterance Representation

Gao, Haoyu, Wang, Rui, Lin, Ting-En, Wu, Yuchuan, Yang, Min, Huang, Fei, Li, Yongbin

arXiv.org Artificial Intelligence

Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks. Previous DTS methods either focus on semantic similarity or dialogue coherence to assess topic similarity for unsupervised dialogue segmentation. However, the topic similarity cannot be fully identified via semantic similarity or dialogue coherence. In addition, the unlabeled dialogue data, which contains useful clues of utterance relationships, remains underexploited. In this paper, we propose a novel unsupervised DTS framework, which learns topic-aware utterance representations from unlabeled dialogue data through neighboring utterance matching and pseudo-segmentation. Extensive experiments on two benchmark datasets (i.e., DialSeg711 and Doc2Dial) demonstrate that our method significantly outperforms the strong baseline methods. For reproducibility, we provide our code and data at:https://github.com/AlibabaResearch/DAMO-ConvAI/tree/main/dial-start.